Smart Paper Notes

FACOS: Finding API Relevant Contents on Stack Overflow with Semantic and Syntactic Analysis

Kien Luong , Mohammad Hadi , Ferdian Thung , Fatemeh Fard , David Lo

Category: Artificial Intelligence

2021-11-14

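Collecting API examples, usages, and mentions relevant to a specific API method from discussions on venues such as Stack Overflow is not a trivial problem. It takes effort to correctly recognize whether a discussion refers to the API method that developers or tools are searching for. The content of a thread, which consists of both text paragraphs describing the involvement of the API method in the discussion and code snippets containing the API invocation, may refer to the given API method. Leveraging this observation, we develop FACOS, a context-specific algorithm that captures the semantic and syntactic information of the paragraphs and code snippets in a discussion. FACOS combines a syntactic word-based score with a score from a predictive model fine-tuned from CodeBERT. FACOS beats the state-of-the-art approach by 13.9% in terms of F1-score.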

Translated by Google Translate
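
The FACOS abstract says only that a syntactic word-based score is combined with a score from a fine-tuned CodeBERT model; it does not say how. The sketch below is one possible reading, not the authors' method: the token-overlap scoring, the linear blend, and the names `syntactic_score`, `combined_score`, `weight`, and `threshold` are all assumptions made for illustration.

```python
def syntactic_score(api_tokens, thread_tokens):
    """Fraction of the API's name tokens found in the thread (hypothetical scoring rule)."""
    api = {t.lower() for t in api_tokens}
    thread = {t.lower() for t in thread_tokens}
    return len(api & thread) / len(api) if api else 0.0


def combined_score(syntactic, semantic, weight=0.5):
    """Blend the two scores linearly; both the linear form and `weight` are assumptions."""
    return weight * syntactic + (1.0 - weight) * semantic


def is_relevant(syntactic, semantic, weight=0.5, threshold=0.5):
    """Flag a thread as relevant when the blended score reaches a hypothetical threshold."""
    return combined_score(syntactic, semantic, weight) >= threshold
```

Here `semantic` stands in for the probability a fine-tuned classifier would output; the sketch does not load or call any model.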

Russia-Ukraine war: Modeling and Clustering the Sentiments Trends of Various Countries

Hamed Vahdat-Nejad , Mohammad Ghasem Akbari , Fatemeh Salmani , Faezeh Azizi , Hamid-Reza Nili-Sani

Category: Natural Language Processing

2023-01-02

With Twitter's growth and popularity, a huge number of views are shared by users on various topics, making this platform a valuable information source on various political, social, and economic issues. This paper investigates English tweets on the Russia-Ukraine war to analyze trends reflecting users' opinions and sentiments regarding the conflict. The tweets' positive and negative sentiments are analyzed using a BERT-based model, and the time series associated with the frequency of positive and negative tweets for various countries is calculated. Then, we propose a method based on the neighborhood average for modeling and clustering the time series of countries. The clustering results provide valuable insight into public opinion regarding this conflict. Among other things, we can mention the similar thoughts of users from the United States, Canada, the United Kingdom, and most Western European countries versus the shared views of Eastern European, Scandinavian, Asian, and South American nations toward the conflict.
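
The abstract mentions a neighborhood-average method for modeling each country's time series but gives no details. As a rough illustration only, the sketch below reads "neighborhood average" as a centered moving average; that reading, the function name `neighborhood_average`, and the `radius` parameter are assumptions rather than the paper's definition.

```python
def neighborhood_average(series, radius=1):
    """Replace each point by the mean of its neighbors within `radius` steps.

    Windows are truncated at the ends of the series (an assumed boundary rule).
    """
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - radius)
        hi = min(len(series), i + radius + 1)
        window = series[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

Smoothed series of this kind could then be compared pairwise before clustering; the distance measure and clustering algorithm are not stated in the abstract.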


Focal-UNet: UNet-like Focal Modulation for Medical Image Segmentation

MohammadReza Naderi , MohammadHossein Givkashi , Fatemeh Piri , Nader Karimi , Shadrokh Samavi

Category: Computer Vision

2022-12-19

Recently, many attempts have been made to construct a transformer-based U-shaped architecture, and new methods have been proposed that outperform CNN-based rivals. However, serious problems such as blockiness and cropped edges in predicted masks remain because of transformers' patch partitioning operations. In this work, we propose a new U-shaped architecture for medical image segmentation with the help of the newly introduced focal modulation mechanism. The proposed architecture has asymmetric depths for the encoder and decoder. Because the focal module can aggregate local and global features, our model can simultaneously benefit from the wide receptive field of transformers and the local viewing of CNNs. This helps the proposed method balance local and global feature usage and outperform Swin-UNet, one of the most powerful transformer-based U-shaped models. We achieved a 1.68% higher DICE score and a 0.89 better HD metric on the Synapse dataset. Also, with extremely limited data, we achieved a 4.25% higher DICE score on the NeoPolyp dataset. Our implementations are available at: https://github.com/givkashi/Focal-UNet
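
For readers unfamiliar with the DICE score reported above, the following minimal sketch computes the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. It is not taken from the linked repository; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_score(pred, target):
    """Dice coefficient between two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of foreground pixel counts in each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total
```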


Beyond Digital "Echo Chambers": The Role of Viewpoint Diversity in Political Discussion

Rishav Hada , Amir Ebrahimi Fard , Sarah Shugars , Federico Bianchi , Patricia Rossini , Dirk Hovy , Rebekah Tromble , Nava Tintarev

Category: Natural Language Processing

2022-12-18

Increasingly taking place in online spaces, modern political conversations are typically perceived to be unproductively affirming -- siloed in so-called "echo chambers" of exclusively like-minded discussants. Yet, to date we lack sufficient means to measure viewpoint diversity in conversations. To this end, in this paper, we operationalize two viewpoint metrics proposed for recommender systems and adapt them to the context of social media conversations. This is the first study to apply these two metrics (Representation and Fragmentation) to real world data and to consider the implications for online conversations specifically. We apply these measures to two topics -- daylight savings time (DST), which serves as a control, and the more politically polarized topic of immigration. We find that the diversity scores for both Fragmentation and Representation are lower for immigration than for DST. Further, we find that while pro-immigrant views receive consistent pushback on the platform, anti-immigrant views largely operate within echo chambers. We observe less severe yet similar patterns for DST. Taken together, Representation and Fragmentation paint a meaningful and important new picture of viewpoint diversity.


Utilizing distilBert transformer model for sentiment classification of COVID-19's Persian open-text responses

Fatemeh Sadat Masoumi , Mohammad Bahrani

Category: Natural Language Processing

2022-12-16

The COVID-19 pandemic has caused drastic alterations in all aspects of human life. The government's laws in this regard affected the lifestyle of all people. For this reason, studying the sentiment of individuals is essential to anticipating the impacts of future pandemics. To contribute to this aim, we proposed an NLP (Natural Language Processing) model to analyze open-text answers to a survey in Persian and detect the positive and negative feelings of people in Iran. In this study, a distilBert transformer model was applied to this task. We compared three approaches, and our best model achieved an accuracy of 0.824, a precision of 0.824, a recall of 0.798, and an F1 score of 0.804.


On-device Training: A First Overview on Existing Systems

Shuai Zhu , Thiemo Voigt , JeongGil Ko , Fatemeh Rahimian

Category: Machine Learning

2022-12-01

The recent breakthroughs in machine learning (ML) and deep learning (DL) have enabled many new capabilities across many application domains. While most existing machine learning models require large memory and computing power, efforts have been made to deploy some models on resource-constrained devices as well. There are several systems that perform inference on the device, while direct training on the device still remains a challenge. On-device training, however, is attracting more and more interest because: (1) it enables training models on local data without needing to share data over the cloud, thus enabling privacy-preserving computation by design; (2) models can be refined on devices to provide personalized services and cope with model drift in order to adapt to changes in the real-world environment; and (3) it enables the deployment of models in remote, hard-to-access locations or places without stable internet connectivity. We summarize and analyze state-of-the-art systems research to provide the first survey of on-device training from a systems perspective.


CliMedBERT: A Pre-trained Language Model for Climate and Health-related Text

B. Jalalzadeh Fard , S. A. Hasan , J. E. Bell

Category: Natural Language Processing

2022-12-01

Climate change is threatening human health to an unprecedented degree and in many ways. These threats are expected to grow unless effective and evidence-based policies are developed and acted upon to minimize or eliminate them. Attaining such a task requires the highest degree of knowledge flow from science into policy. The multidisciplinary and location-specific nature of published science, together with its vastness, makes it challenging to keep track of novel work in this area and makes traditional knowledge synthesis methods inefficient at infusing science into policy. To this end, we consider developing multiple domain-specific language models (LMs) with different variations from climate- and health-related information, which can serve as a foundational step toward capturing available knowledge to enable solving different tasks, such as detecting similarities between climate- and health-related concepts, fact-checking, relation extraction, evidence of health effects to policy text generation, and more. To our knowledge, this is the first work that proposes developing multiple domain-specific language models for the considered domains. We will make the developed models, resources, and codebase available to researchers.


Evolutionary Deep Reinforcement Learning for Dynamic Slice Management in O-RAN

Fatemeh Lotfi , Omid Semiari , Fatemeh Afghah

Category: Artificial Intelligence | Machine Learning | Neural and Evolutionary Computing

2022-08-30

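Next-generation wireless networks are required to satisfy a variety of services and criteria concurrently. To address the upcoming strict criteria, a new open radio access network (O-RAN) was developed, with distinguishing features such as flexible design, disaggregated virtual and programmable components, and intelligent closed-loop control. O-RAN slicing is being investigated as a critical strategy for ensuring network quality of service (QoS) in the face of changing circumstances. However, distinct network slices must be dynamically controlled to avoid service level agreement (SLA) variation caused by rapid changes in the environment. Therefore, this paper introduces a novel framework able to manage the network slices through intelligently provisioned resources. Due to diverse heterogeneous environments, intelligent machine learning approaches require sufficient exploration to handle the harshest situations in a wireless network and to accelerate convergence. To solve this problem, a new solution based on evolutionary deep reinforcement learning (EDRL) is proposed to accelerate and optimize the slice management learning process in the radio access network (RAN) intelligent controller (RIC) modules. To this end, O-RAN slicing is represented as a Markov decision process (MDP), which is then solved optimally for resource allocation to meet service demand using the EDRL approach. In terms of meeting service demands, simulation results show that the proposed approach outperforms the DRL baseline by 62.2%.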


Mismatching-Aware Unsupervised Translation Quality Estimation For Low-Resource Languages

Fatemeh Azadi , Heshaam Faili , Mohammad Javad Dousti

Category: Natural Language Processing

2022-07-31

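Translation quality estimation (QE) is the task of predicting the quality of machine translation (MT) output without any reference. As an important component of practical MT applications, this task has received increasing attention. In this paper, we first propose XLMRScore, a simple unsupervised QE method based on BERTScore computed with the XLM-RoBERTa (XLMR) model, and discuss the issues that arise when using it. Next, we suggest two approaches to mitigate these issues: replacing untranslated words with the unknown token, and cross-lingual alignment of the pre-trained model so that aligned words are represented closer to each other. We evaluate the proposed method on four low-resource language pairs of the WMT21 QE shared task, as well as on a new English-Farsi test dataset introduced in this paper. Experiments show that our method achieves results comparable to the supervised baseline in two zero-shot scenarios, i.e., with less than 0.01 difference in Pearson correlation, while outperforming unsupervised rivals by more than 8% on average across all the low-resource language pairs.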
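
BERTScore, on which XLMRScore is based, greedily matches each token embedding on one side to its most similar embedding on the other side and combines the resulting precision and recall into an F1 value. The sketch below illustrates that matching step on toy vectors. It is not the authors' implementation: the vectors are hand-written stand-ins rather than XLM-RoBERTa outputs, the source sentence is assumed to play the role of the reference, and the names `cosine` and `greedy_match_f1` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_match_f1(source_vecs, hypothesis_vecs):
    """BERTScore-style F1: each token is matched to its most similar counterpart."""
    if not source_vecs or not hypothesis_vecs:
        return 0.0
    # Recall: how well each source token is covered by the hypothesis.
    recall = sum(max(cosine(s, h) for h in hypothesis_vecs)
                 for s in source_vecs) / len(source_vecs)
    # Precision: how well each hypothesis token is supported by the source.
    precision = sum(max(cosine(h, s) for s in source_vecs)
                    for h in hypothesis_vecs) / len(hypothesis_vecs)
    if precision + recall <= 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In this toy setting, a hypothesis that covers only one of two orthogonal source tokens gets full precision but half recall, which shows how a missing translation lowers the score.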


Clustering Object-Centric Event Logs

Anahita Farhang Ghahfarokhi , Fatemeh Akoochekian , Fareed Zandkarimi , Wil M. P. van der Aalst

Category: Artificial Intelligence

2022-07-26

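Process mining provides various algorithms to analyze process executions based on event data. Process discovery, the most prominent category of process mining techniques, aims to discover process models from an event log; however, it leads to spaghetti models when working with real-life data. Therefore, several clustering techniques have been proposed on top of traditional event logs (i.e., event logs with a single case notion) to reduce the complexity of process models and discover homogeneous subsets of cases. Nevertheless, in real-life processes, particularly in the context of business-to-business (B2B) processes, multiple objects are involved in a process. Recently, object-centric event logs (OCELs) have been introduced to capture the information of such processes, and several process discovery techniques have been developed on top of OCELs. Yet, the output of the proposed discovery techniques on real OCELs leads to more informative but also more complex models. In this paper, we propose a clustering-based approach that clusters similar objects in OCELs to simplify the obtained process models. Using a case study of a real B2B process, we demonstrate that our approach reduces the complexity of the process models and generates coherent subsets of objects that help end users gain insights into the process.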
